Salmonid Mortality Data

View data on TidyTuesday

This week is Norwegian salmonid mortality data reported by aquaculture companies, including salmon and rainbow trout. I decided to limit the focus to salmon. There were two datasets—mortality, based on the average death-to-live-fish ratio, turned into a risk factor; and losses in absolute terms, broken down into categories including dead fish physically removed, fish rejected and discarded at the slaughterhouse, and other losses to predators, theft, etc.

Dead ReckoningHistorical monthly salmon mortality (2020–2025)
2.001.000.00Mortality Risk (%)
2020202120222023202420252026
1st QuartileMedian3rd Quartile
Deadliest MonthTotal salmon losses by month of the year (2020–2025)
40.0M20.0M0Losses
JanFebMarAprMayJunJulAugSepOctNovDec
DeadDiscardedOther

The Queries

I learned something new the other night.

I knew DuckDB lets you skip select * and write a query starting with from that would automatically select all columns. What I didn’t realize was that you can actually flip the select and from clauses—so you can start with the table or view you’re querying, and then start selecting columns. I’ve long since developed a habit of leaving a blank line, typing from table_name, any joins, and then going back up to fill in my select clause now that autocomplete has a scope to suggest columns from. So being able to lead with the table makes so much sense. These are the things that make DuckDB a joy to use. (I’m going to miss this feature so much in Snowflake now.)

Anyway, getting to the actual queries—there was pretty much nothing to do for the first chart except filter the data I wanted:

   from monthly_mortality_data

 select date,
        q1      as "1st Quartile",
        median  as "Median",
        q3      as "2nd Quartile"

  where geo_group = 'country'
    and species = 'salmon'

For the second one, I did some summing and extracting month names and numbers for sorting:

with losses as (
       from monthly_losses_data

     select strftime(date, '%b')  as month,
            month(date)           as num_month,
            sum(dead)             as Dead,
            sum(discarded)        as Discarded,
            sum(other)            as Other

      where geo_group = 'country'
        and species = 'salmon'

   group by all
   order by num_month
)

from losses select * exclude num_month