Skip to contents

How to Prepare Data

Release

  1. Read data.

    Load the source release data into memory with read.csv or some other reader functions. The source data set used here is from .

    release_custom <- read.csv("./sample_releases.csv")
  2. Process coded-wire tag information.

    We now start to find the total number number of releases and the production expansion factor using the source data. Production expansion factor is defined as the number of tagged fish with clipped adipose fin / the number of total releases.

    release_custom$cwt_1st_mark_count[is.na(release_custom$cwt_1st_mark_count)] <- 0
    release_custom$cwt_2nd_mark_count[is.na(release_custom$cwt_2nd_mark_count)] <- 0
    release_custom$non_cwt_1st_mark_count[is.na(release_custom$non_cwt_1st_mark_count)] <- 0
    release_custom$non_cwt_2nd_mark_count[is.na(release_custom$non_cwt_2nd_mark_count)] <- 0
    
    release_custom$Total_Released <- (release_custom$cwt_1st_mark_count
                               + release_custom$cwt_2nd_mark_count
                               + release_custom$non_cwt_1st_mark_count
                               + release_custom$non_cwt_2nd_mark_count)
    
    release_custom$prod_exp <- release_custom$cwt_1st_mark_count / release_custom$Total_Released
    
    names(release_custom)[7] <- "tag_code"
  3. Extract time and produce the final data frame

    release_custom <- release_custom |> 
      dplyr::mutate(release_month = lubridate::month(lubridate::ymd(last_release_date)),
             total_release = Total_Released) |> 
      dplyr::select(release_month,
             brood_year,
             tag_code,
             prod_exp,
             total_release)
  4. Here are the required column names.

    release_custom |> colnames()
    #> [1] "release_month" "brood_year"    "tag_code"      "prod_exp"     
    #> [5] "total_release"
  5. Here is the final result.

    release_custom |> head()
    #>   release_month brood_year tag_code  prod_exp total_release
    #> 1             6       2007    67001 0.2490268       1253793
    #> 2             6       2007    68602 0.2499226       1779855
    #> 3             6       2007    68603 0.2481118       1860859
    #> 4             5       2008    68640 0.9778000        270000
    #> 5             6       2008    68623 0.2484728        141110
    #> 6             6       2008    68638 0.2498524       1601137

Recovery

Once we have the release data frame under our belt, we can use it to construct the recovery data frame. Here is an example of how to create it.

  1. Read relevant data

    On top of the release data frame, we also need several other sources of information. We need coded-wire tag recovery information, which contains when and where a tag was recovered. Then, we need the site code data frame to translate the recovery site id into human readable regions. Lastly, we need to know the minimum harvest size limit in inches at the time and location that the tag was recovered. Note that if we have all these pieces of information in one data frame, we do not have to combine these three data frames.

    recovery_custom = read.csv("./sample_recoveries.csv")
    site_code = read.csv("./sample_site_code.csv")
    size_limit = read.csv("./sample_size_limits.csv")
  2. Combine relevant data frames

    site_code <- site_code |>
      dplyr::mutate(sampling_site = sampsite,
             location = area.1) |>
      dplyr::select(sampling_site, location, agency)
    
    size_limit <- size_limit |>
      dplyr::mutate(location = Location,
             month = Month,
             size_limit = limit) |>
      dplyr::select(-c('Month', 'Location', 'limit'))
    
    recovery_custom <- recovery_custom |> 
      dplyr::mutate(month = lubridate::month(lubridate::ymd(recovery_date))) |> 
      dplyr::select(run_year,
             recovery_id,
             fishery,
             tag_code,
             sex,
             month,
             sampling_site,
             recovery_location_code,
             sampling_site,
             reporting_agency,
             estimated_number,
             length) |> 
      dplyr::left_join(release, by = "tag_code") |> 
      dplyr::select(-c("release_month",
                "prod_exp",
                "total_release")) |> 
      dplyr::left_join(site_code, by = c("sampling_site" = "sampling_site",
                                  "reporting_agency" = "agency")) |> 
      dplyr::filter(fishery %in% c(10, 40, 54, 50, 46)) |>
      dplyr::mutate(est_num = estimated_number) |>
      dplyr::left_join(size_limit, by = c('run_year', 'fishery', 'location', 'month')) |>
      dplyr::select("run_year",
             "fishery",
             "tag_code",
             "length",
             "sex",
             "month",
             "location",
             "size_limit",
             "est_num")
  3. Here are the required column names.

    recovery_custom |> colnames()
    #> [1] "run_year"   "fishery"    "tag_code"   "length"     "sex"       
    #> [6] "month"      "location"   "size_limit" "est_num"
  4. Here is the final result.

    recovery_custom |> head()
    #>   run_year fishery tag_code length sex month location size_limit est_num
    #> 1     2009      50    67001    496   M    11     <NA>         NA    1.01
    #> 2     2009      50    67001    497   M    11     <NA>         NA    1.01
    #> 3     2009      50    67001    410   M    11     <NA>         NA    1.00
    #> 4     2009      50    67001    505   M    11     <NA>         NA    1.00
    #> 5     2010      10    67001    670         7       FB         27    1.97
    #> 6     2010      10    67001    730         7       FB         27    1.57

Survival

The age-specific survival rate is the ratio between the survived individuals at the end of an age and the total individuals at the beginning of an age. Typically, the survival rate for age-2 cohort is 0.5, while the rate for age-3 and up is 0.8.

Here is how to create a custom survival rate input.

survival_custom = data.frame(age = 2:6, rate = c(0.5, rep(0.8, times = 4)))

survival_custom
#>   age rate
#> 1   2  0.5
#> 2   3  0.8
#> 3   4  0.8
#> 4   5  0.8
#> 5   6  0.8

Fisheries

The package requires a mapping between each type of fishery and its corresponding identifier.

Here is how to create a custom fisheries-id mapping.

list(oc_rec = 40,
     oc_com = 10,
     esc_sp = 54,
     esc_hat = 50,
     riv_harv = 46)
#> $oc_rec
#> [1] 40
#> 
#> $oc_com
#> [1] 10
#> 
#> $esc_sp
#> [1] 54
#> 
#> $esc_hat
#> [1] 50
#> 
#> $riv_harv
#> [1] 46

Hook-and-Release Mortality

Hook-and-release mortality rate is the proportion of fish that are The default data frame is included in the package and named release_mort. Here is how to create a custom hook-and-release mortality data frame.

release_mort_custom <- data.frame(run_year = rep(1981, 20), 
                           fishery = c(rep(40, 12), rep(10, 7), 40), 
                           location = c('FB', 'KC', 'MO', 'SF', 'FB', 'KC', 'MO', 'SF', 'FB',
                                        'KC', 'MO', 'SF', 'CO', 'FB', 'KC', 'KO', 'MO', 'NO',
                                        'SF', 'CO'), 
                           month = c(rep(c(2, 3, 4), each = 4), rep(5, 7), 5), 
                           rate = c(rep(0.14, 12), rep(0.26, 7), 0.14))
release_mort_custom
#>    run_year fishery location month rate
#> 1      1981      40       FB     2 0.14
#> 2      1981      40       KC     2 0.14
#> 3      1981      40       MO     2 0.14
#> 4      1981      40       SF     2 0.14
#> 5      1981      40       FB     3 0.14
#> 6      1981      40       KC     3 0.14
#> 7      1981      40       MO     3 0.14
#> 8      1981      40       SF     3 0.14
#> 9      1981      40       FB     4 0.14
#> 10     1981      40       KC     4 0.14
#> 11     1981      40       MO     4 0.14
#> 12     1981      40       SF     4 0.14
#> 13     1981      10       CO     5 0.26
#> 14     1981      10       FB     5 0.26
#> 15     1981      10       KC     5 0.26
#> 16     1981      10       KO     5 0.26
#> 17     1981      10       MO     5 0.26
#> 18     1981      10       NO     5 0.26
#> 19     1981      10       SF     5 0.26
#> 20     1981      40       CO     5 0.14

Drop Off Mortality

Drop off mortality is the proportion of fish encountered by the gear that is killed without being brought to the vessel intact. The default value is 0.05 (Salmon Technical Team 2021).

Length-at-age

Length-at-age is a data frame encoding the mean and standard deviation total length at each age-month.

Here is how to create a custom length-at-age data frame.

length_at_age_custom <- data.frame(
  age = c(rep(1, 10), rep(2, 12)),
  month = c(3:12, 1:12),
  mean = c(rep(19.06998, 14), 21.16612, 22.17341, 
           22.88485, 24.27430, 24.77973, 23.88669, 24.82676, 25.94237),
  sd = c(rep(2.151048, 14), 2.272190, 2.250629, 2.393862, 2.191818, 
         2.098376, 2.129863, 1.940248, 1.985157)
)

length_at_age_custom
#>    age month     mean       sd
#> 1    1     3 19.06998 2.151048
#> 2    1     4 19.06998 2.151048
#> 3    1     5 19.06998 2.151048
#> 4    1     6 19.06998 2.151048
#> 5    1     7 19.06998 2.151048
#> 6    1     8 19.06998 2.151048
#> 7    1     9 19.06998 2.151048
#> 8    1    10 19.06998 2.151048
#> 9    1    11 19.06998 2.151048
#> 10   1    12 19.06998 2.151048
#> 11   2     1 19.06998 2.151048
#> 12   2     2 19.06998 2.151048
#> 13   2     3 19.06998 2.151048
#> 14   2     4 19.06998 2.151048
#> 15   2     5 21.16612 2.272190
#> 16   2     6 22.17341 2.250629
#> 17   2     7 22.88485 2.393862
#> 18   2     8 24.27430 2.191818
#> 19   2     9 24.77973 2.098376
#> 20   2    10 23.88669 2.129863
#> 21   2    11 24.82676 1.940248
#> 22   2    12 25.94237 1.985157

How to Use the Function

Bootstrap or Point Estimate?

The package uses either bootstrapped data or point estimates. Bootstrapped estimations allow us to approximate the spread of each parameters through simulated uncertainties, while point estimates do not as they are deterministic. Simply put, bootstrapped estimations offer more information than point estimates. However, it takes significantly more computing cycles to find bootstrapped estimations than to find point estimates.

Bootstrapped Estimation

We model the number of fish missed by the sampling effort with an negative binomial distribution. Assuming the negative binomial distribution models the number of failures before the kth success, the number of fish that would have been recovered is 1+rnbinom(1,1/est_num)1 + rnbinom(1, 1/est\_num) where est_num is the number of fish estimated to be recovered by one tag (point estimate).

Bootstrap with Detail
result = cohort_reconstruct(rel = release, reco = recovery,
                            birth_month = 6L, last_month = 12L,
                            bootstrap = TRUE, iter = 10L, 
                            detail = TRUE, verbose = FALSE)

result[["2007"]][["2"]][["1"]][["data"]]
#>                         1          2          3          4          5
#> ocean_abundance 11191.952 11456.1228 11815.3818 11212.6189 10826.8287
#> impact              0.000     0.0000     0.0000     0.0000     0.0000
#> maturation          0.000     0.0000     0.0000     0.0000     0.0000
#> natural_mort      628.156   642.9828   663.1464   629.3159   607.6632
#>                          6         7          8          9         10
#> ocean_abundance 12116.2684 10582.802 12128.2769 11714.1797 11530.4519
#> impact              0.0000     0.000     0.0000     0.0000     0.0000
#> maturation          0.0000     0.000     0.0000     0.0000     0.0000
#> natural_mort      680.0339   593.967   680.7079   657.4664   647.1545
result[["2007"]][["2"]][["1"]][["summary"]]
#>                     median        sd  CrI_low   CrI_high
#> ocean_abundance 11493.2873 513.86663 11476.56 11510.0114
#> impact              0.0000   0.00000     0.00     0.0000
#> maturation          0.0000   0.00000     0.00     0.0000
#> natural_mort      645.0687  28.84112   644.13   646.0073
Bootstrap Without Detail
result = cohort_reconstruct(rel = release, reco = recovery,
                            birth_month = 6L, last_month = 12L,
                            bootstrap = TRUE, iter = 10L, 
                            detail = FALSE, verbose = FALSE)

result[["2007"]][["2"]][["1"]][["data"]]
#>                        1        2        3        4        5        6        7
#> ocean_abundance 11189.24 11446.14 11751.16 11687.43 11267.15 11400.68 11667.22
#>                        8        9       10
#> ocean_abundance 11738.19 11158.83 10799.96
result[["2007"]][["2"]][["1"]][["summary"]]
#>                   median      sd  CrI_low CrI_high
#> ocean_abundance 11423.41 311.448 11413.18 11433.64
Point Estimate with Detail
result = cohort_reconstruct(rel = release, reco = recovery,
                            birth_month = 6L, last_month = 12L,
                            bootstrap = FALSE, iter = 10L, 
                            detail = TRUE, verbose = FALSE)

result[["2007"]][["2"]][["1"]][["data"]]
#>                      value
#> ocean_abundance 11431.3501
#> impact              0.0000
#> maturation          0.0000
#> natural_mort      641.5924
result[["2007"]][["2"]][["1"]][["summary"]]
#>                      value
#> ocean_abundance 11431.3501
#> impact              0.0000
#> maturation          0.0000
#> natural_mort      641.5924
Point Estimate without Detail
result = cohort_reconstruct(rel = release, reco = recovery,
                            birth_month = 6L, last_month = 12L,
                            bootstrap = FALSE, iter = 10L, 
                            detail = FALSE, verbose = FALSE)

result[result$by == 2007 & result$age == 2 & result$month == 1, ]
#>       by   age month ocean_abundance
#>    <num> <num> <num>           <num>
#> 1:  2007     2     1        11431.35

Dissecting the Return Value

There are two types of return value, depending on the values of detail and bootstrap. In the case when both detail and bootstrap are FALSE, the return value is a data table of the following form:

| by (integer) | age (integer) | month (integer) | ocean_abundance (double) |
|--------------|---------------|-----------------|--------------------------|
| 1995         | 3             | 5               | 3.814342                 |
| 1995         | 3             | 4               | 10.853284                |
| 1995         | 3             | 3               | 11.056992                |
| 1995         | 3             | 2               | 11.264524                |
| 1995         | 3             | 1               | 11.475951                |
| 1995         | 3             | 12              | 11.691346                |
| 1995         | 3             | 11              | 11.910784                |
| 1995         | 3             | 10              | 12.134340                |
| 1995         | 3             | 9               | 12.362093                |
| 1995         | 3             | 8               | 12.594120                |

In the cases when either detail, bootstrap, or both are set to TRUE, the return value is a three dimensional list. The first dimension encodes brood year information (by), the second age, and the third month. Each age has an age-specific summary information. Each month has data and month-specific summary information. Depending on detail and bootstrap, the corresponding output will vary.

Here is the structure of the return value:

  "by" "age" "month"

├── [["2002"]]
│   ├──[["2"]]
│   │   ├── [["1"]]
│   │   │   ├── `data`
│   │   │   └── `summary`
│   │   ├── ...
│   │   │
│   │   ├── [["age_summary"]]
│   │   │   ├── `data`
│   │       └── `summary`
│   ├── [["3"]]
│   │   ├── [["1"]]
│   │   │   ├── `data`
│   │   │   └── `summary`
│   │   ├── ...
│   │   │
│   │   ├── [["age_summary"]]
│   │   │   ├── `data`
│   │       └── `summary`
│   ├── ...
│   │
│   ├──[["by_summary"]]
│   │  ├── `data`
│   │  └── `summary`
├── ...

Summary

  • Parameter: the statistics in question.
  • Median: the median for the parameter.
  • SD: the standard deviation for the parameter.
  • CrI_low: the lower bound of the credible interval.
  • CrI_high: the upper bound of the credible interval.
Brood-year-specific summary (under by_summary)
  • srr: spawner reduction rate, or adult equivalent exploitation rate, is the “reduction in a brood’s potential adult spawning escapement owing to ocean fisheries, relative to its escapement potential in the absence of ocean fishing” (O’Farrell et al. 2012):
  • s1: early life survival rate is the proportion of the abundance of fish first month in ocean, relative to the total release number (Chen et al.,).
| Parameter | Median  | SD      | CrI_low | CrI_high |
|-----------|---------|---------|---------|----------|
| srr       | 0.18061 | 0.00871 | 0.17996 | 0.18126  |
| s1        | 0.00519 | 0.00008 | 0.00518 | 0.00521  |
Age-specific summary (under age_summary)
  • imp_rate: age-specific fishery impact rate.
  • mat_rate: age-specific maturation rate.
| Parameter | Median  | SD      | CrI_low | CrI_high |
|-----------|---------|---------|---------|----------|
| imp_rate  | 0.20872 | 0.00730 | 0.20815 | 0.20930  |
| mat_rate  | 0.63548 | 0.01462 | 0.63546 | 0.63550  |
Month-specific summary (under each month)
  • ocean_abundance: number of individuals in the ocean at that time.
  • impact: mortality due to fishing.
  • maturation: number of spawners.
  • natural_mort: mortality due to natural causes.
| Parameter        | Median      | SD       | CrI_low     | CrI_high    |
|------------------|-------------|----------|-------------|-------------|
| ocean_abundance  | 33954.6031  | 497.3167 | 33871.9672  | 34037.2390  |
| impact           | 0.0000      | 0.0000   | 0.0000      | 0.0000      |
| maturation       | 0.0000      | 0.0000   | 0.0000      | 0.0000      |
| natural_mort     | 625.5616    | 9.1623   | 624.0392    | 627.0841    |

Data

Each labeled entry is one bootstrapped iteration, where the label n corresponds to the nth iteration.

Brood-year-specific data
| Measurement | 1       | 2       | 3       |
|-------------|---------|---------|---------|
| srr         | 0.18414 | 0.17711 | 0.19864 | 
| s1          | 0.00526 | 0.00514 | 0.00515 |
Age-specific data
| Parameter | 1       | 2       | 3       |
|-----------|---------|---------|---------|
| imp_rate  | 0.20745 | 0.20182 | 0.22336 |
| mat_rate  | 0.63768 | 0.65303 | 0.61126 |
Month-specific data
| Parameter        | 1         | 2         | 3         |
|------------------|-----------|-----------|-----------|
| ocean_abundance  | 34138.2384| 33660.7294| 33638.2628|
| impact           | 0.0000    | 0.0000    | 0.0000    |
| maturation       | 0.0000    | 0.0000    | 0.0000    |
| natural_mort     | 628.9448  | 620.1474  | 619.7335  |

References

Chen EK, Satterthwaite WH, O’Farrell MR, Carlson SM (????). “Incorporating age structure into the assessment and forecasting of Sacramento River Fall Chinook salmon (Oncorhynchus tshawytscha).” In preparation.

Salmon Technical Team (2000). “STT Recommendations for Hooking Mortality Rates in 2000 Recreational Ocean Chinook and Coho Fisheries.” Pacific Fishery Management Council, Portland. https://www.pcouncil.org/documents/2021/05/stt-recommendations-for-hooking-mortality-rates-in-2000-recreational-ocean-chinook-and-coho-fisheries.pdf/.