Brazilian Companies

View data on TidyTuesday

This dataset lists companies in Brazil with their legal nature (sole proprietorship, publicly traded corporation, etc.), size (micro enterprise, small enterprise, or other), the qualifications of the owner, and their capital stock.

I thought the breakdown by size and legal type was interesting, as was comparing how many companies versus how much capital make up each grouping.

Listing every legal nature in the data overwhelmed the visual a bit, so I did some categorizing myself: all limited liability corporations in one group, all types of partnerships in another, all co-ops in a third, and a bunch of the remaining types in an “Other Type” bucket.

Figure (Sankey diagram): Brazilian Breakdown. Number of companies by size and legal category.

Brazil → Micro enterprise: 66,202
Brazil → Other: 42,520
Brazil → Small enterprise: 32,610
Micro enterprise → LLC: 51,699
Other → LLC: 36,241
Small enterprise → LLC: 31,531
Micro enterprise → Sole Proprietorship: 14,162
Other → Partnership: 2,939
Other → Privately Held Corporation: 2,892
Small enterprise → Sole Proprietorship: 920
Other → Other Type: 175
Micro enterprise → Partnership: 172
Micro enterprise → Other Type: 166
Small enterprise → Partnership: 150
Other → Sole Proprietorship: 127
Other → Cooperative: 79
Other → Publicly Traded Corporation: 52
Other → State-Owned Enterprise: 15
Small enterprise → Other Type: 7
Micro enterprise → Privately Held Corporation: 3
Small enterprise → Privately Held Corporation: 2

Figure (Sankey diagram): Brazilian Breakdown (Capital). Combined capital (BRL) of companies by size and legal category.

Brazilian capital → Small enterprise: R$ 27.3T
Brazilian capital → Other: R$ 21.3T
Brazilian capital → Micro enterprise: R$ 1.4T
Small enterprise → LLC: R$ 27.3T
Other → LLC: R$ 20.6T
Micro enterprise → LLC: R$ 1.2T
Other → Privately Held Corporation: R$ 487.6B
Micro enterprise → Sole Proprietorship: R$ 172.6B
Other → Publicly Traded Corporation: R$ 164.5B
Micro enterprise → Other Type: R$ 34.5B
Other → Partnership: R$ 34.1B
Other → Sole Proprietorship: R$ 6.2B
Other → State-Owned Enterprise: R$ 5.4B
Other → Other Type: R$ 3.8B
Other → Cooperative: R$ 1.8B
Micro enterprise → Partnership: R$ 700.7M
Micro enterprise → Privately Held Corporation: R$ 565.8M
Small enterprise → Sole Proprietorship: R$ 410.2M
Small enterprise → Partnership: R$ 176.3M
Small enterprise → Other Type: R$ 2.3M
Small enterprise → Privately Held Corporation: R$ 474.0K

Queries

This was my staging table of sorts, where I pulled in the CSV, formatted company_size, and added a column categorizing legal_nature.

create or replace table companies as

 select * replace (
            -- Turn `small-enterprise` into `Small enterprise`
            replace(upper(company_size[1]) || company_size[2:], '-', ' ') as company_size
        ),
        -- Group legal entities into categories
        case when legal_nature in (
                    -- pass-through
                    'Sole Proprietorship',
                    'Privately Held Corporation',
                    'Publicly Traded Corporation',
                    'State-Owned Enterprise') then legal_nature
             -- group types containing key phrases
             when legal_nature like '%Limited Liability%' then 'LLC'
             when legal_nature like '%Partnership%' then 'Partnership'
             when legal_nature like '%Cooperative%' then 'Cooperative'
             else 'Other Type'
              end as legal_category
   from 'https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2026/2026-01-27/companies.csv'
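
A quick way to sanity-check this staging step is to try the formatting expression on a few literal values and then list which legal_nature values landed in each legal_category. This is only a sketch, not part of my original workflow. It assumes the companies table above has been created; the sample value `small-enterprise` comes from the comment in the query, while `micro-enterprise` and `other` are assumed from the chart labels, and `samples` and `raw_size` are made-up names.

```sql
-- Try the formatting expression on sample values.
-- [1] takes the first character, [2:] takes everything from the second character on.
select raw_size,
       replace(upper(raw_size[1]) || raw_size[2:], '-', ' ') as company_size
  from (values ('small-enterprise'), ('micro-enterprise'), ('other')) as samples(raw_size);

-- List every legal_nature under the category it was assigned,
-- to spot anything that fell into 'Other Type' unexpectedly.
select legal_category,
       legal_nature,
       count(*) as companies
  from companies
 group by all
 order by legal_category, companies desc;
```

If a legal nature shows up under 'Other Type' that really belongs elsewhere, the fix is another `when ... like` branch in the case expression.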

To get the multi-level Sankey diagram, I needed to union all two sets of rows: one grouping only by company size, with ‘Brazil’ as the source, and one grouping by size and legal category to count the “flow” between them.

create or replace view companies_sankey as

   select 'Brazil' as source,
          company_size as target,
          count(*) as companies
     from companies
 group by all

union all

   select company_size,
          legal_category,
          count(*) as companies
     from companies
 group by all

 order by 3 desc, company_size, legal_category

I ordered by the third column's position rather than its name so the same order by clause carried over to the capital query unchanged, even though the third column there is capital_stock instead of companies:

create or replace view capital_sankey as

   select 'Brazilian capital' as source,
          company_size as target,
          sum(capital_stock) as capital_stock
     from companies
 group by all

union all

   select company_size,
          legal_category,
          sum(capital_stock) as capital_stock
     from companies
 group by all

 order by 3 desc, company_size, legal_category
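
The count-versus-capital contrast between the two charts can also be put side by side in a single result. This is a minimal sketch, assuming the companies table above; the `pct_companies` and `pct_capital` column names are made up for illustration, and the cast to double is just a cautious choice since I have not checked how capital_stock is typed after the CSV import.

```sql
-- Share of companies vs. share of capital for each company size.
-- The window sums run over the grouped rows, so each one yields the grand total.
select company_size,
       count(*) as companies,
       round(100 * count(*)::double / sum(count(*)) over (), 1) as pct_companies,
       sum(capital_stock) as capital_stock,
       round(100 * sum(capital_stock)::double / sum(sum(capital_stock)) over (), 1) as pct_capital
  from companies
 group by company_size
 order by companies desc;
```

Going by the rounded chart figures, micro enterprises should come out at roughly 47% of companies but only about 3% of combined capital.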